# Swin-GPT2 Architecture
Im2latex Base
A VisionEncoderDecoder model that generates LaTeX formulas from images, using a Swin Transformer encoder and a GPT-2 decoder.
Image-to-Text
Transformers

Matthijs0
56
1
Vit Swin Base 224 Gpt2 Image Captioning
MIT
An image captioning model built on the VisionEncoderDecoder architecture, with a Swin Transformer visual encoder and a GPT-2 decoder, fine-tuned on the COCO 2014 dataset.
Image-to-Text
Transformers English

Abdou
321
2